Goto

Collaborating Authors

 telemetry data


NetBurst: Event-Centric Forecasting of Bursty, Intermittent Time Series

arXiv.org Artificial Intelligence

Forecasting on widely used benchmark time series data (e.g., ETT, Electricity, Taxi, and Exchange Rate, etc.) has favored smooth, seasonal series, but network telemetry time series -- traffic measurements at service, IP, or subnet granularity -- are instead highly bursty and intermittent, with heavy-tailed bursts and highly variable inactive periods. These properties place the latter in the statistical regimes made famous and popularized more than 20 years ago by B.~Mandelbrot. Yet forecasting such time series with modern-day AI architectures remains underexplored. We introduce NetBurst, an event-centric framework that reformulates forecasting as predicting when bursts occur and how large they are, using quantile-based codebooks and dual autoregressors. Across large-scale sets of production network telemetry time series and compared to strong baselines, such as Chronos, NetBurst reduces Mean Average Scaled Error (MASE) by 13--605x on service-level time series while preserving burstiness and producing embeddings that cluster 5x more cleanly than Chronos. In effect, our work highlights the benefits that modern AI can reap from leveraging Mandelbrot's pioneering studies for forecasting in bursty, intermittent, and heavy-tailed regimes, where its operational value for high-stakes decision making is of paramount interest.


A Transformer-Based Conditional GAN with Multiple Instance Learning for UAV Signal Detection and Classification

arXiv.org Artificial Intelligence

Unmanned Aerial Vehicles (UAVs) are increasingly used in surveillance, logistics, agriculture, disaster management, and military operations. Accurate detection and classification of UAV flight states, such as hovering, cruising, ascending, or transitioning, which are essential for safe and effective operations. However, conventional time series classification (TSC) methods often lack robustness and generalization for dynamic UAV environments, while state of the art(SOTA) models like Transformers and LSTM based architectures typically require large datasets and entail high computational costs, especially with high-dimensional data streams. This paper proposes a novel framework that integrates a Transformer-based Generative Adversarial Network (GAN) with Multiple Instance Locally Explainable Learning (MILET) to address these challenges in UAV flight state classification. The Transformer encoder captures long-range temporal dependencies and complex telemetry dynamics, while the GAN module augments limited datasets with realistic synthetic samples. MIL is incorporated to focus attention on the most discriminative input segments, reducing noise and computational overhead. Experimental results show that the proposed method achieves superior accuracy 96.5% on the DroneDetect dataset and 98.6% on the DroneRF dataset that outperforming other SOTA approaches. The framework also demonstrates strong computational efficiency and robust generalization across diverse UAV platforms and flight states, highlighting its potential for real-time deployment in resource constrained environments.


Transforming Decoder-Only Transformers for Accurate WiFi-Telemetry Based Indoor Localization

arXiv.org Artificial Intelligence

Wireless Fidelity (WiFi) based indoor positioning is a widely researched area for determining the position of devices within a wireless network. Accurate indoor location has numerous applications, such as asset tracking and indoor navigation. Despite advances in WiFi localization techniques -- in particular approaches that leverage WiFi telemetry -- their adoption in practice remains limited due to several factors including environmental changes that cause signal fading, multipath effects, interference, which, in turn, impact positioning accuracy. In addition, telemetry data differs depending on the WiFi device vendor, offering distinct features and formats; use case requirements can also vary widely. Currently, there is no unified model to handle all these variations effectively. In this paper, we present WiFiGPT, a Generative Pretrained Transformer (GPT) based system that is able to handle these variations while achieving high localization accuracy. Our experiments with WiFiGPT demonstrate that GPTs, in particular Large Language Models (LLMs), can effectively capture subtle spatial patterns in noisy wireless telemetry, making them reliable regressors. Compared to existing state-of-the-art methods, our method matches and often surpasses conventional approaches for multiple types of telemetry. Achieving sub-meter accuracy for RSSI and FTM and centimeter-level precision for CSI demonstrates the potential of LLM-based localisation to outperform specialized techniques, all without handcrafted signal processing or calibration.


Unsupervised anomaly detection in large-scale estuarine acoustic telemetry data

arXiv.org Artificial Intelligence

Acoustic telemetry data plays a vital role in understanding the behaviour and movement of aquatic animals. However, these datasets, which often consist of millions of individual data points, frequently contain anomalous movements that pose significant challenges. Traditionally, anomalous movements are identified either manually or through basic statistical methods, approaches that are time-consuming and prone to high rates of unidentified anomalies in large datasets. This study focuses on the development of automated classifiers for a large telemetry dataset comprising detections from fifty acoustically tagged dusky kob monitored in the Breede Estuary, South Africa. Using an array of 16 acoustic receivers deployed throughout the estuary between 2016 and 2021, we collected over three million individual data points. We present detailed guidelines for data pre-processing, resampling strategies, labelling process, feature engineering, data splitting methodologies, and the selection and interpretation of machine learning and deep learning models for anomaly detection. Among the evaluated models, neural networks autoencoder (NN-AE) demonstrated superior performance, aided by our proposed threshold-finding algorithm. NN-AE achieved a high recall with no false normal (i.e., no misclassifications of anomalous movements as normal patterns), a critical factor in ensuring that no true anomalies are overlooked. In contrast, other models exhibited false normal fractions exceeding 0.9, indicating they failed to detect the majority of true anomalies; a significant limitation for telemetry studies where undetected anomalies can distort interpretations of movement patterns. While the NN-AE's performance highlights its reliability and robustness in detecting anomalies, it faced challenges in accurately learning normal movement patterns when these patterns gradually deviated from anomalous ones.


Combined Hyper-Extensible Extremely-Secured Zero-Trust CIAM-PAM architecture

arXiv.org Artificial Intelligence

Customer Identity and Access Management (CIAM) systems play a pivotal role in securing enterprise infrastructures. However, the complexity of implementing these systems requires careful architectural planning to ensure positive Return on Investment (RoI) and avoid costly delays. The proliferation of Active Persistent cyber threats, coupled with advancements in AI, cloud computing, and geographically distributed customer populations, necessitates a paradigm shift towards adaptive and zero-trust security frameworks. This paper introduces the Combined Hyper-Extensible Extremely-Secured Zero-Trust (CHEZ) CIAM-PAM architecture, designed specifically for large-scale enterprises. The CHEZ PL CIAM-PAM framework addresses critical security gaps by integrating federated identity management (private and public identities), password-less authentication, adaptive multi-factor authentication (MFA), microservice-based PEP (Policy Entitlement Point), multi-layer RBAC (Role Based Access Control) and multi-level trust systems. This future-proof design also includes end-to-end data encryption, and seamless integration with state-of-the-art AI-based threat detection systems, while ensuring compliance with stringent regulatory standards.


VORTEX: A Spatial Computing Framework for Optimized Drone Telemetry Extraction from First-Person View Flight Data

arXiv.org Artificial Intelligence

This paper presents the Visual Optical Recognition Telemetry EXtraction (VORTEX) system for extracting and analyzing drone telemetry data from First Person View (FPV) Uncrewed Aerial System (UAS) footage. VORTEX employs MMOCR, a PyTorch-based Optical Character Recognition (OCR) toolbox, to extract telemetry variables from drone Heads Up Display (HUD) recordings, utilizing advanced image preprocessing techniques, including CLAHE enhancement and adaptive thresholding. The study optimizes spatial accuracy and computational efficiency through systematic investigation of temporal sampling rates (1s, 5s, 10s, 15s, 20s) and coordinate processing methods. Results demonstrate that the 5-second sampling rate, utilizing 4.07% of available frames, provides the optimal balance with a point retention rate of 64% and mean speed accuracy within 4.2% of the 1-second baseline while reducing computational overhead by 80.5%. Comparative analysis of coordinate processing methods reveals that while UTM Zone 33N projection and Haversine calculations provide consistently similar results (within 0.1% difference), raw WGS84 coordinates underestimate distances by 15-30% and speeds by 20-35%. Altitude measurements showed unexpected resilience to sampling rate variations, with only 2.1% variation across all intervals. This research is the first of its kind, providing quantitative benchmarks for establishing a robust framework for drone telemetry extraction and analysis using open-source tools and spatial libraries.


Machine learning-driven Anomaly Detection and Forecasting for Euclid Space Telescope Operations

arXiv.org Artificial Intelligence

State-of-the-art space science missions increasingly rely on automation due to spacecraft complexity and the costs of human oversight. The high volume of data, including scientific and telemetry data, makes manual inspection challenging. Machine learning offers significant potential to meet these demands. The Euclid space telescope, in its survey phase since February 2024, exemplifies this shift. Euclid's success depends on accurate monitoring and interpretation of housekeeping telemetry and science-derived data. Thousands of telemetry parameters, monitored as time series, may or may not impact the quality of scientific data. These parameters have complex interdependencies, often due to physical relationships (e.g., proximity of temperature sensors). Optimising science operations requires careful anomaly detection and identification of hidden parameter states. Moreover, understanding the interactions between known anomalies and physical quantities is crucial yet complex, as related parameters may display anomalies with varied timing and intensity. We address these challenges by analysing temperature anomalies in Euclid's telemetry from February to August 2024, focusing on eleven temperature parameters and 35 covariates. We use a predictive XGBoost model to forecast temperatures based on historical values, detecting anomalies as deviations from predictions. A second XGBoost model predicts anomalies from covariates, capturing their relationships to temperature anomalies. We identify the top three anomalies per parameter and analyse their interactions with covariates using SHAP (Shapley Additive Explanations), enabling rapid, automated analysis of complex parameter relationships. Our method demonstrates how machine learning can enhance telemetry monitoring, offering scalable solutions for other missions with similar data challenges.


A Digital Twin Framework for Liquid-cooled Supercomputers as Demonstrated at Exascale

arXiv.org Artificial Intelligence

The framework enables the study of "what-if" scenarios, system optimizations, and virtual prototyping of future systems. Using Frontier as a case study, we demonstrate the framework's capabilities by replaying six months of system telemetry for systematic verification and validation. Such a comprehensive analysis of a liquid-cooled ex-ascale supercomputer is the first of its kind. ExaDigiT elucidates complex transient cooling system dynamics, runs synthetic or real workloads, and predicts energy losses due to rectification and voltage conversion. Throughout our paper, we present lessons learned to benefit HPC practitioners developing similar digital twins. We envision the digital twin will be a key enabler for sustainable, energy-efficient supercomputing.


Integrating Biological Data into Autonomous Remote Sensing Systems for In Situ Imageomics: A Case Study for Kenyan Animal Behavior Sensing with Unmanned Aerial Vehicles (UAVs)

arXiv.org Artificial Intelligence

In situ imageomics leverages machine learning techniques to infer biological traits from images collected in the field, or in situ, to study individuals organisms, groups of wildlife, and whole ecosystems. Such datasets provide real-time social and environmental context to inferred biological traits, which can enable new, data-driven conservation and ecosystem management. The development of machine learning techniques to extract biological traits from images are impeded by the volume and quality data required to train these models. Autonomous, unmanned aerial vehicles (UAVs), are well suited to collect in situ imageomics data as they can traverse remote terrain quickly to collect large volumes of data with greater consistency and reliability compared to manually piloted UAV missions. However, little guidance exists on optimizing autonomous UAV missions for the purposes of remote sensing for conservation and biodiversity monitoring. The UAV video dataset curated by KABR: In-Situ Dataset for Kenyan Animal Behavior Recognition from Drone Videos required three weeks to collect, a time-consuming and expensive endeavor. Our analysis of KABR revealed that a third of the videos gathered were unusable for the purposes of inferring wildlife behavior. We analyzed the flight telemetry data from portions of UAV videos that were usable for inferring wildlife behavior, and demonstrate how these insights can be integrated into an autonomous remote sensing system to track wildlife in real time. Our autonomous remote sensing system optimizes the UAV's actions to increase the yield of usable data, and matches the flight path of an expert pilot with an 87% accuracy rate, representing an 18.2% improvement in accuracy over previously proposed methods.


The OPS-SAT benchmark for detecting anomalies in satellite telemetry

arXiv.org Artificial Intelligence

Detecting anomalous events in satellite telemetry is a critical task in space operations. This task, however, is extremely time-consuming, error-prone and human dependent, thus automated data-driven anomaly detection algorithms have been emerging at a steady pace. However, there are no publicly available datasets of real satellite telemetry accompanied with the ground-truth annotations that could be used to train and verify anomaly detection supervised models. In this article, we address this research gap and introduce the AI-ready benchmark dataset (OPSSAT-AD) containing the telemetry data acquired on board OPS-SAT -- a CubeSat mission which has been operated by the European Space Agency which has come to an end during the night of 22--23 May 2024 (CEST). The dataset is accompanied with the baseline results obtained using 30 supervised and unsupervised classic and deep machine learning algorithms for anomaly detection. They were trained and validated using the training-test dataset split introduced in this work, and we present a suggested set of quality metrics which should be always calculated to confront the new algorithms for anomaly detection while exploiting OPSSAT-AD. We believe that this work may become an important step toward building a fair, reproducible and objective validation procedure that can be used to quantify the capabilities of the emerging anomaly detection techniques in an unbiased and fully transparent way.